Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output cleanup -- release as many file resources as possible when they are no longer needed #722

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

hellkite500
Copy link
Member

Large scale ngen runs use a significant amount of file handles between the model engine, the formulation init files, and the routing routine. The current mechanics don't close and release file handles (or any other resource, really) between the rainfall-runoff model steps and the routing steps. This PR provides mechanisms for releasing resources when they are no longer needed, and should dramatically reduce the number of open file handles used throught the life cycle of a fully integrated routing run.

Additions

  • Manager and container public functions for releasing shared resources.

Removals

Changes

  • ngen main now releases formulation resources before launching routing.

Checklist

  • PR has an informative and human-readable title
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code follows project standards (link if applicable)
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • Project documentation has been updated (including the "Unreleased" section of the CHANGELOG)
  • Reviewers requested with the Reviewers tool ➡️

Target Environment support

  • Linux
  • MacOs

Copy link
Contributor

@program-- program-- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments, mostly to clarify my understanding

virtual ~FileStreamHandler() override {
auto tmp = std::static_pointer_cast<std::ofstream>(output_stream);
if(tmp){
tmp->close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is needed, unless we were checking the failbit. When output_stream's lifetime ends, its destructor will be called, so if it's an ofstream it'll close the file anyways.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not remembering exactly why I did this (that's what I get for leaving it sitting in my working tree for so long 😵‍💫). I think you are correct, and this may be an artifact of some early attempts to ensure the files were were closed but due to the shared references of formulations they were not and it took me a while to sort that out haha. I'll take a closer look and remove this if it isn't being useful.

Comment on lines +550 to +551
features.release_formulations();
manager->clear();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm not understanding it, what's the reason for using shared pointers if we're going to handle manually decrementing the reference counts anyways? It would make more sense to put features and manager within an RAII block so they go out of scope before needing to call these.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jumping around here in my responses, but they are all somewhat related.

Formulations exist as shared pointers because we want to ensure we only create one formulation per feature and reuse that formulation in different contexts.

The formulation manager (which superseded the network indexing I'll discuss momentarily) exists entirely to abstract and encapsulate the parsing, configuring, and instantiation of formulations, i.e. ready to execute models associated with some feature.

It does this reasonalby well, and ultimately creates a map of feature_id -> Formulation. Each formulation must be instantiated and ready to "run" before be we can begin using them in the modeling steps, so this container does just that. The hy_features container, then, is responsible for aligning the topology and network with these formulations. It uses the constructed formulations from the manager and "ties" them to the HY_Catchment realization. The network uses feature indexing and maps to provide the indirection back to to the formulation when used in the actual computational steps...

I'm not sure if that helped in understanding or not. I can try to articulate that more completely if needed. To your point on using RAII blocks...yes, that would be one solution. The functions to explicitly release the resources by clearing the underlying maps that hold the references are another. Having these functions provides some additional flexibility beyond relying on the RAII blocks, though. For example, a Network still has a valid feature index that may be useful/needed without having a need for the model formulation associated with the features indexed by the network (the complete HY_Catchment "realization"). Similarly, the manager actually manages a little bit more than just formulations -- it manages time and other configurations that have utility without require references to the model formuations for each catchment.

@@ -539,6 +539,17 @@ int main(int argc, char *argv[]) {

} //done time

//Close down any file handles we have
for(auto& f : nexus_outfiles){
f.second.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to below, nexus_outfiles should be moved into the function and placed within an RAII block, so the destructor can handle it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nexus output handling is still pretty rough, there's been a long standing TODO to refactor it -- but its been pretty low priority.

virtual ~HY_CatchmentArea();

protected:

polygon_t bounds;

utils::StreamHandler output;
std::unique_ptr<utils::StreamHandler> output;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to use pointers anyways, why not just use std::ostream classes directly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StreamHandler is a bit of an abstract wrapper around ostream...the original intent was the ability to have an abstract output interface that was easily connected to existing mechanics (like ofstream for csv/file output) but also be able to implement more complicated output mechanics, such as writing to NetCDF, using the same stream operator/semantics and have a simple way of reconfiguring and switching between these using the base StreamHandler interface.

This could definitely evolve and change.

I think the reason I moved these to unique pointers was to provide some protection against pass by value copies, but I could be wrong on that motivation. I know I definitely wanted to ensure that a given formulation/realization was only ever associated with a single output resource that was tightly tied to it its lifecycle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an aside to actually reviewing this PR, I'm pretty sure we should just make StreamHandler go away.

@PhilMiller
Copy link
Contributor

A lot of the resource releasing logic in here might fit in well with the finalize() interface that I wrote into #819.

As to RAII, I think we should probably aim for explicit high-level resource releases, and maybe embrace RAII-driven destruction in lower-level components where there's no need for distinction between release from destruction. My experience is that it can become way too difficult to arrange scopes and implicit lifetimes to fit initialization, the desired forward execution, and the necessary release/shutdown order. Ideally, everything would be a nice LIFO stack, but dependencies often don't work out that way.

@PhilMiller
Copy link
Contributor

As an additional note on explicit release, going down that path also means that there's a point in the program where we can internally assert various checks that resource usage is down as we expect, as a guard against things that we don't leak, but shouldn't have retained.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants